Live freelance tracking. Raw descriptions turned into structured data. Find your next tech project without the noise.
upwork.com | 2026-06-01
Local Supplier Directory Crawler
Client: United States, member since 2012-11-22
Price: ****
Problem: Need for a portable, local-execution scraper to extract structured manufacturer data from public directories without reliance on third-party SaaS platforms.
Existing: Dedicated scraping machine
Specifications:
[Target] Public supplier and manufacturer directories
[Method] Seed URL/Keyword-based crawling
[Stack] Python or Node.js (Playwright, Puppeteer, Scrapy, Crawlee, BeautifulSoup, or Selenium)
[Format] CSV, JSON
[Security] Proxy configuration, Timeout settings, Crawl delay
[Logic] Deduplication (Domain, Company Name, Phone, Address)
[Logic] Resume capability for interrupted jobs
[Logic] Logging (Success, Failures, Blocks, Timeouts, Duplicates)
[UI/UX] Config file (Input/Output paths, Max records, Max pages)
[UI/UX] README (Installation, Execution, Input/Output formatting, Troubleshooting)
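
Illustration of the deduplication requirement: the listing names the four fields (Domain, Company Name, Phone, Address) but not the matching rule, so the following is only a minimal Python sketch under stated assumptions. It assumes a record is a duplicate when its normalized domain matches an earlier record, or its normalized phone matches, or its company name and address match together. The record keys (`website`, `phone`, `company_name`, `address`) and all function and class names are hypothetical.

```python
import re
from urllib.parse import urlparse


def normalize_domain(url: str) -> str:
    """Return the lower-cased host without a leading 'www.', or '' if none."""
    if not url:
        return ""
    # urlparse only fills in the host when a scheme or '//' prefix is present.
    if not re.match(r"^[a-z][a-z0-9+.-]*://", url, re.I):
        url = "//" + url
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def normalize_text(value: str) -> str:
    """Lower-case, replace punctuation with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", (value or "").lower())
    return " ".join(cleaned.split())


def normalize_phone(value: str) -> str:
    """Keep digits only, so '(555) 010-2000' and '555-010-2000' compare equal."""
    return re.sub(r"\D+", "", value or "")


def dedup_keys(record: dict) -> set:
    """Build the set of keys under which a record can collide with another."""
    keys = set()
    domain = normalize_domain(record.get("website", ""))
    phone = normalize_phone(record.get("phone", ""))
    name = normalize_text(record.get("company_name", ""))
    address = normalize_text(record.get("address", ""))
    if domain:
        keys.add(("domain", domain))
    if phone:
        keys.add(("phone", phone))
    if name and address:  # assumed rule: name alone is too weak a signal
        keys.add(("name_address", name, address))
    return keys


class Deduplicator:
    """Remembers the keys of accepted records and flags later collisions."""

    def __init__(self):
        self.seen = set()

    def is_duplicate(self, record: dict) -> bool:
        keys = dedup_keys(record)
        if keys & self.seen:
            return True
        self.seen |= keys
        return False
```

A caller would run each extracted record through `Deduplicator.is_duplicate` and write a "duplicate" log entry for every record that returns True. Stricter or looser matching (for example, requiring all four fields to agree) only changes `dedup_keys`.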
Workflow:
1. Input processing of categories and seed URLs via config file
2. Web crawling and structured data extraction of company records
3. Application of deduplication logic and error handling/retries
4. Export of validated records to CSV and JSON formats
5. Generation of execution logs and state tracking for resume functionality
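
Illustration of the workflow: the steps above could be wired together roughly as follows. This is a sketch using only the Python standard library, not a definitive implementation. It assumes a JSON state file holding the pending and completed URLs, a JSON Lines file that accumulates records between runs, and a caller-supplied `fetch_page(url)` that returns `(records, next_urls)`. That callable, the `FIELDS` column set, the file layout, and every function name here are hypothetical; the actual page fetching (Playwright, Scrapy, or similar) is deliberately left out.

```python
import csv
import json
import logging
import os
from collections import deque

log = logging.getLogger("crawler")
FIELDS = ["company_name", "website", "phone", "address"]  # assumed column set


def load_state(state_path, seed_urls):
    """Resume from a saved state file if present, else start from the seeds."""
    if os.path.exists(state_path):
        with open(state_path, encoding="utf-8") as fh:
            state = json.load(fh)
        return deque(state["pending"]), set(state["done"])
    return deque(seed_urls), set()


def save_state(state_path, pending, done):
    """Write to a temp file first so an interrupted write cannot corrupt state."""
    tmp_path = state_path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as fh:
        json.dump({"pending": list(pending), "done": sorted(done)}, fh)
    os.replace(tmp_path, state_path)


def crawl(seed_urls, fetch_page, state_path, records_path, max_pages=100, retries=3):
    """Crawl breadth-first; fetch_page(url) must return (records, next_urls)."""
    pending, done = load_state(state_path, seed_urls)
    while pending and len(done) < max_pages:
        url = pending.popleft()
        if url in done:
            continue
        found, next_urls = [], []
        for attempt in range(1, retries + 1):
            try:
                found, next_urls = fetch_page(url)
                log.info("success url=%s records=%d", url, len(found))
                break
            except Exception as exc:  # broad on purpose in this sketch
                log.warning("failure url=%s attempt=%d error=%s", url, attempt, exc)
        # Append records before checkpointing, so a crash can only cause
        # re-crawled (duplicate) rows, never lost ones.
        with open(records_path, "a", encoding="utf-8") as fh:
            for record in found:
                fh.write(json.dumps(record) + "\n")
        done.add(url)
        pending.extend(u for u in next_urls if u not in done and u not in pending)
        save_state(state_path, pending, done)


def export(records_path, csv_path, json_path):
    """Convert the JSON Lines record log into the final CSV and JSON files."""
    with open(records_path, encoding="utf-8") as fh:
        records = [json.loads(line) for line in fh if line.strip()]
    with open(csv_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(records)
    with open(json_path, "w", encoding="utf-8") as fh:
        json.dump(records, fh, indent=2)
    return len(records)
```

Re-running `crawl` with the same `state_path` after an interruption picks up at the first unfinished URL. Because records are appended before the checkpoint is saved, a crash between the two can re-crawl one page, which is why deduplication would still run before or during `export`. A URL that fails all retries is marked done with no records in this sketch; a real job might keep it in a separate retry list instead.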